Beautiful Soup is a Python package for parsing HTML and XML documents, including documents with malformed markup such as unclosed tags; it is named after "tag soup". It creates a parse tree for each parsed page that can be used to extract data from HTML, which makes it useful for web scraping. [http://www.crummy.com/software/BeautifulSoup/] It is available for Python 2.6+ and Python 3.

==Code example==
# Anchor extraction from an HTML document (Python 3; on Python 2, use urllib2.urlopen instead)
from bs4 import BeautifulSoup
from urllib.request import urlopen

webpage = urlopen('http://en.wikipedia.org/wiki/Main_Page')
# Name the parser explicitly so results do not depend on which parsers are installed
soup = BeautifulSoup(webpage, 'html.parser')
for anchor in soup.find_all('a'):
    # Fall back to '/' when an anchor has no href attribute
    print(anchor.get('href', '/'))
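The handling of malformed markup described in the introduction can also be tried without network access. The following is a minimal sketch, not taken from the original article: the HTML string is a made-up snippet whose `<p>` and `<b>` tags are never closed, and the choice of Python's built-in `html.parser` is an assumption (other installed parsers may repair the same markup differently).

```python
from bs4 import BeautifulSoup

# Hypothetical malformed snippet: the <p> and <b> tags are never closed.
html = '<p>Hello <b>world <a href="/wiki/Tag_soup">tag soup</a>'

# Build the parse tree with the standard-library parser (assumed choice).
soup = BeautifulSoup(html, 'html.parser')

# Collect the href attribute of every anchor in the tree.
links = [anchor.get('href') for anchor in soup.find_all('a')]

# Text of the first paragraph, including text nested in child tags.
paragraph_text = soup.p.get_text()

print(links)
print(paragraph_text)
```

Even though the input never closes its tags, the parse tree can still be navigated with `find_all` and attribute access such as `soup.p`, which is what makes the package practical for real-world pages.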